# Use Docker Compose

<!--
SPDX-FileCopyrightText: 2023 LakeSoul Contributors

SPDX-License-Identifier: Apache-2.0
-->

## Docker Compose Files
We provide a docker compose env to quickly start a local PostgreSQL service and a MinIO S3 Storage service. The docker compose env is located under [lakesoul-docker-compose-env](https://github.com/lakesoul-io/LakeSoul/tree/main/docker/lakesoul-docker-compose-env).

## Install Docker Compose
To install docker compose, please refer to [Install Docker Engine](https://docs.docker.com/engine/install/)

## Start docker compose
To start the docker compose env, cd into the docker compose env dir, and execute the command:
```bash
cd docker/lakesoul-docker-compose-env/
docker compose up -d
```
Then use `docker compose ps` to check both services' statuses are `running(healthy)`. The PostgreSQL service would automatically setup the database and tables required by LakeSoul Meta. And the MinIO service would setup a public bucket. You can change the user, password, database name and MinIO bucket name accordingly in the `docker-compose.yml` file.

## Run LakeSoul Tests in Docker Compose Env
### Prepare LakeSoul Properties File
```ini title="lakesoul.properties"
lakesoul.pg.driver=com.lakesoul.shaded.org.postgresql.Driver
lakesoul.pg.url=jdbc:postgresql://lakesoul-docker-compose-env-lakesoul-meta-db-1:5432/lakesoul_test?stringtype=unspecified
lakesoul.pg.username=lakesoul_test
lakesoul.pg.password=lakesoul_test
```
### Prepare Spark Image
You could use bitnami's Spark 3.3 docker image with packaged hadoop denendencies:
```bash
docker pull bitnami/spark:3.3.1
```

### Start Spark Shell
```bash
docker run --net lakesoul-docker-compose-env_default --rm -ti \
    -v $(pwd)/lakesoul.properties:/opt/spark/work-dir/lakesoul.properties \
    --env lakesoul_home=/opt/spark/work-dir/lakesoul.properties bitnami/spark:3.3.1 \
    spark-shell \
    --packages com.dmetasoul:lakesoul-spark:2.3.0-spark-3.3 \
    --conf spark.sql.extensions=com.dmetasoul.lakesoul.sql.LakeSoulSparkSessionExtension \
    --conf spark.sql.catalog.lakesoul=org.apache.spark.sql.lakesoul.catalog.LakeSoulCatalog \
    --conf spark.sql.defaultCatalog=lakesoul \
    --conf spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
    --conf spark.hadoop.fs.s3a.buffer.dir=/opt/spark/work-dir/s3a \
    --conf spark.hadoop.fs.s3a.path.style.access=true \
    --conf spark.hadoop.fs.s3a.endpoint=http://minio:9000 \
    --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider
```

### Execute LakeSoul Scala APIs
```scala
val tablePath= "s3://lakesoul-test-bucket/test_table"
val df = Seq(("2021-01-01",1,"rice"),("2021-01-01",2,"bread")).toDF("date","id","name")
df.write
  .mode("append")
  .format("lakesoul")
  .option("rangePartitions","date")
  .option("hashPartitions","id")
  .option("hashBucketNum","2")
  .save(tablePath)
```

### Verify Data Written Successfully
Open link http://127.0.0.1:9001/buckets/lakesoul-test-bucket/browse/ in your browser to verify that LakeSoul table has been written to MinIO successfully.
Use minioadmin1:minioadmin1 to login into MinIO's console.

## Cleanup Meta Tables and MinIO Bucket
To cleanup all contents in LakeSoul meta tables, execute:
```bash
docker exec -ti lakesoul-docker-compose-env-lakesoul-meta-db-1 psql -h localhost -U lakesoul_test -d lakesoul_test -f /meta_cleanup.sql
```
To cleanup all contents in MinIO bucket, execute:
```bash
docker run --net lakesoul-docker-compose-env_default --rm -t bitnami/spark:3.3.1 aws --no-sign-request --endpoint-url http://minio:9000 s3 rm --recursive s3://lakesoul-test-bucket/
```

## Shutdown Docker Compose Env
```bash
cd docker/lakesoul-docker-compose-env/
docker compose stop
docker compose down
```